
10 research outputs found

Additional file 1: of Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers

Author: Agostinho Antunes (47525)
Alberto Fernández (2962992)
Deborah Galpert (5173559)
Francisco Herrera (3763534)
Guillermin Agüero-Chapin (199484)
Reinaldo Molina-Ruiz (199491)

Proteome FASTA files for the following yeast species: S. cerevisiae, C. glabrata, K. waltii and K. lactis. (ZIP 5844 kb)

FigShare

Exploring the Adenylation Domain Repertoire of Nonribosomal Peptide Synthetases Using an Ensemble of Sequence-Search Methods

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)
Publication date: 16/07/2013

The introduction of two-dimensional (2D) graphs and their numerical characterization for comparative analyses of DNA/RNA and protein sequences without the need for sequence alignments is a recent but active research topic in bioinformatics. Here, we used a 2D artificial representation (four-color maps) with a simple numerical characterization through topological indices (TIs) to aid the discovery of remote homologs of adenylation domains (A-domains) from the nonribosomal peptide synthetase (NRPS) class in the proteome of the cyanobacterium Microcystis aeruginosa. Cyanobacteria are a rich source of structurally diverse oligopeptides that are predominantly synthesized by NRPS. Several A-domains share amino acid identities lower than 20%, making them a possible source of remote homologs. Therefore, A-domains cannot be easily retrieved by BLASTp searches using a single template. To cope with the sequence diversity of the A-domains, we combined homology-search methods with an alignment-free tool that uses protein four-color maps. TI2BioP (Topological Indices to BioPolymers) version 2.0, available at http://ti2biop.sourceforge.net/, allowed the calculation of simple TIs from the protein sequences (four-color maps). These TIs were used as input predictors for the statistical estimations required to build the alignment-free models. We concluded that using graphical/numerical approaches in cooperation with other sequence-search methods, such as multi-template BLASTp and profile HMMs, can give the most complete exploration of the repertoire of highly diverse protein families.

Directory of Open Access Journals

PubMed Central

FigShare

Testing different topologies for the MLP on the A-domain classification using TIs from four-color maps.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

Accuracy performance and error on training, selection and test sets.

FigShare

From the protein sequence to its numerical characterization.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

(A) The first nine amino acids of PDB entry 1AMU. (B and C) Building the four-color map for A. (D) The definition of the node adjacency matrix derived from the four-color map in C.

FigShare

True positives vs. false positives in the A-domain detection for different sequence-search methods among the overall dataset involved in the study.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

True positives vs. false positives in the A-domain detection for different sequence-search methods among the overall dataset involved in the study.

FigShare

Classification results for the three alignment-free models (GDA, DTM and ANN) in A-domains detection.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

Classification results for the three alignment-free models (GDA, DTM and ANN) in A-domains detection.

FigShare

Architecture for the DTM. Decision Nodes are represented in blue and terminal nodes are in red.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

A-domains are marked with a dashed line, whereas CATH domains are marked with a continuous line. Labels at the right corner of the nodes indicate tentative membership of the A-domain or CATH domain class. Numbers at the left corner give the node number.

FigShare

Assessing the relationship between the number of TIs entered in each model and the Wilks' λ values obtained for each one.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

Assessing the relationship between the number of TIs entered in each model and the Wilks' λ values obtained for each one.

FigShare

Dot plot for the global sequence identity matrix obtained by Needleman-Wunsch algorithm for A-domains.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

(A) All A-domains involved in the study. (B) A-domains of the test set.
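As a rough illustration of how a global sequence identity matrix of this kind could be computed, the following Python sketch aligns each pair of sequences with a basic Needleman-Wunsch recursion and reports the fraction of identical aligned positions. It is not the authors' pipeline: the scoring values (match 1, mismatch -1, linear gap -1), the definition of identity as matches divided by alignment length, and the function names are all assumptions made for this example.

```python
def needleman_wunsch_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return matches / alignment length.

    Scoring values are illustrative assumptions, not taken from the study.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting identical columns.
    i, j, matches, length = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        length += 1
    return matches / length if length else 0.0


def identity_matrix(seqs):
    """Pairwise identity matrix; upper triangle is computed and mirrored."""
    k = len(seqs)
    mat = [[0.0] * k for _ in range(k)]
    for p in range(k):
        for q in range(p, k):
            mat[p][q] = mat[q][p] = needleman_wunsch_identity(seqs[p], seqs[q])
    return mat


# Toy sequences, hypothetical and unrelated to the A-domain dataset.
toy = ["ACDE", "ACE", "ACDE"]
matrix = identity_matrix(toy)
```

Plotting such a matrix as a dot plot or heat map then shows at a glance which sequence pairs fall below a chosen identity threshold.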

FigShare

Re-annotation of the A-domains in the proteome of Microcystis aeruginosa by using an ensemble of algorithms.

Author: Agostinho Antunes (47525)
Aminael Sánchez-Rodríguez (199486)
Emanuel Maldonado (434088)
Guillermin Agüero-Chapin (199484)
Gustavo de la Riva (434089)
Reinaldo Molina-Ruiz (199491)
Vitor Vasconcelos (108651)

Re-annotation of the A-domains in the proteome of Microcystis aeruginosa by using an ensemble of algorithms.

FigShare
